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(57) ABSTRACT 

A method, apparatus , an d an article of m anu facture for 
parallel Execution of SQL operations from stored proce- 
dures. One or more embodiments of the invention provide 
the stored procedure (stored procedure) with a C++ class 
(hereinafter referred to as "dispatcher") that can take an SQL 
query and start parallel execution of the query. The query is 
optimized and parallelized. The dispatcher executes the 
query, sets up the communication links between the various 
operators in the query, and ensures that all the results are sent 
back to the data-server that originated the query request. 
Further, the dispatcher merges the results of the parallel 
execution and produces a single stream of tuples that is fed 
to the calling stored procedure. To provide the single stream 
to the calling stored procedure, one or more embodiments of 
the invention utilize a class that provides the stored proce- 
dure with a simple and easy-to-use interface to access the 
results of the nested SQL execution. 
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METHOD AND APPARATUS FOR PARALLEL 
EXECUTION OF SQL FROM STORED 
PROCEDURES 

CROSS REFERENCE TO RELATED 
APPLICATIONS 

This application is related to the following co-pending 
and commonly assigned patent applications, which are 
incorporated by reference herein: 

Application Sen No. 09/470,215, entitled "METHOD 
AND APPARATUS FOR PARALLEL EXECUTION OF 
SQL FROM WITHIN USER DEFINED FUNCTIONS" 
filed on the same date herewith, by Navin Kabra, Jignesh 
Patel, Jie-Bing Yu, Biswadeep Nag, and Jian-Jun Chen,now 
pending. 

Application Ser. No. 09/449,704, entitled "METHOD 
AND APPARATUS FOR FETCHING ARRAY BASED 
OBJECTS BY DIRECT DELIVERY AND BATCHING" 
filed on Nov. 24, 1999, by Zhe Wang, Biswadeep Nag, 
Jie-Bing Yu, and Jignesh Patel, now pending. 

Application Ser. No. 09/449,085, entitled "QUERY 
MONITOR PLAYBACK MECHANISM FOR POST- 
MORTEM PERFORMANCE ANALYSIS" filed on Nov. 
24, 1999, by Jie-Bing Yu, Jun Li, and Karthikeyan 
Ramasamy, now pending. 

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

TV pre Rffr** i™»*"tifl n relates to sy s t grns-and mefhodsjpr 
performing queries on data stored in a database, and in 
particular to a method an dLsvstem Jor executing sQlT from 
store d procedures . 

2. Description of the Related Art 

Hie ability to manage massive amounts of information 
has become a virtual necessity in business today. The 
information and data are often stored in related files. A set 
of related files is referred to as a database. A database 
management system DBMS) creates and manages one or 
more databases. Today, DBMSs can manage any form of 
data including text, images, sound and video. Further, large- 
scale integrated DBMS' provide an efficient, consistent, and 
secure means for storing and retrieving the vast amounts of 
data. 

Certain computer languages have been developed and 
utilized to interact with and manipulate the data. For 
example, SQL (Structured Query Language) is a language 
used to interrogate and process data in a relational database 
(a database in which relationships are established between 
files and information stored in the database). Originally 
developed for mainframes, most database systems designed 
for client/sever environments support SQL. SQL commands 
can be used to interactively work with a database or can be 
embedded within a programming language to interface to a 
database. Thus, methods and functions may embed and 
utilize SQL commands. 
Stored Procedures 

Users/programmers often need to invoke the same set of 
commands (or the same set of commands with different 
parameters) at varying times and locations of a program. In 
such a situation, the query may be placed into a stored 
procedure. A stored procedure is a batch of SQL statements 
stored in a database/on a server, that may be partially or fully 
processed/compiled before it is stored (or upon its first 
invocation). Additionally, a stored procedure is a method or 
procedure written in any programming language that is 
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partially or fully processed/compiled before it is stored (or 
upon its first invocation). 

Stored procedures may be called directly from a client or 
from a database trigger and are often stored on the server. A 

5 database trigger is a user defined mechanism that causes a 
stored procedure to automatically initiate and execute upon 
the occurrence of the user specified events in the database 
(i.e., when the trigger "fires"). Thus, the trigger will not 
"fire" unless the event(s) specified by the user occurs. For 
example, a user may define a trigger to automatically "fire" 
whenever a user updates, deletes, or inserts data. 

Since the stored procedure is stored on a server, the stored 
procedure is available to all clients and does not need to be 
replicated in each client. Further, by storing the stored 
procedure on the server, when the stored procedure is 

15 modified, all clients automatically get/have access to the 
new version. This saves programming effort especially when 
different client user interfaces and development systems are 
used. Further, this allows stored procedures to be an easy 
mechanism for sharing complex queries and functions 

20 between multiple applications. Additionally, SQL and stored 
procedures may call other stored procedures and may be 
written independently from (and without knowledge of) the 
underlying DBMS. 
A stored procedure may be partially or completely 

25 processed/compiled before it is stored on the database. 
Consequently, the stored procedure does not have to be 
parsed and compiled each time it is invoked. Further, 
because a stored procedure is stored in a compiled format, 
it executes faster than if its constituent commands were 

30 executed individually. 

Alternatively, a stored procedure may not be compiled 
prior to storage but may be automatically compiled the first 
time the procedure is invoked. As part of such a compilation, 
a query execution plan may be generated. The query execu- 

35 tion plan describes the order in which tables are to be 
accessed and the indexes to be used. Further, the query 
execution plan is optimized for the stored procedure param- 
eters and data in the database tables at the time the stored 
procedure is first executed. 

40 A stored procedure may be invoked by its name. The 
caller can pass parameters to and receive results from the 
stored procedure. A user can create and name a stored 
procedure to execute specific database queries and perform 
other database tasks. For example, a user may create a stored 

4S procedure that returns the number of videos of a particular 
movie remaining in a video store for the video title that is 
specified at the time the stored procedure is called. 

Stored procedures may also maintain the integrity of the 
database and prevent unauthorized users from modifying 

50 certain entries. For example, a user may be given the right 
to call a stored procedure that updates a table or set of tables 
but denied the right to update the tables directly. 

Stored procedures may be created using a variety of 
mechanisms. The following format may be utilized to 

ss declare a stored procedure: 


create proc (procedure name} 

60 " r 

{statement of block of statements} 


For example the following stored procedure called rayproc 
will return the number of Casablanca videos left in a video 
65 store as well as other movie titles, the rental price of those 
movie titles, and the location of those videos when Hum- 
phrey Bogart is an actor in the movie: 
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create proc myproc 
as 

begin 

select inv_videos 
from video 

where tide id - "Casablanca" 

select title id, rental price, location 

from video 

where actor - "Humphrey Bo gait" 

end 


In this example, the user would see the same results as if the 
following command were utilized: 


select inv_videos, title id, rental price, location 

from video 

where title id -> "Casablanca" 

or actor ■ "Humphrey Bogart" 


10 


15 


As described above, stored procedures can also be passed 
parameters. Parameters may be defined as part of the stored 
procedure creation statement The syntax of a "create proc" 
command with parameters is: 


20 


Additionally, the "exec" portion of an invocation of a stored 
procedure is not necessary if the stored procedure call is the 
first line in a batch. 

Stored procedures can also have a "return" status. A 
"return" statement returns from the stored procedure with an 
optional status parameter. The return status is zero for 
success, or negative otherwise. Negative values between -1 
and -99 are reserved. For example, the following stored 
procedure returns a negative value (-999) if no rows/records 
are in the result set and a 0 if rows/records are in the result 
set: 


create proc {proc name} 

( @{param_name} {param__type}, 
@{param_name} {param_type}, 
{-..} 

) 

as 

{statement} 


30 


For example, the following stored procedure may be passed 
the @mytitle and @myactor parameters for use in the select 
query: 


35 


create proc myproc 
( @mytiUe char, 
@myactor chai 
) 


begin 


end 


select inv_videos, title id, rental_price, location 

from video 

where title id «* @mytitle 

or actor - @myactoi 


45 


25 


create myproc ( @mytitle char, @myactoi chai) 
as 

begin 

select title id, rental price, location 

from video 

where title id a @mytitle 

or actor ■* @myactor 
if @@rowcount - 0 

return -999 

else 

return 0 

end 


The following commands illustrate the invocation of the 
above stored procedure with a return status: 


40 declare @mystatus int 

exec @mystatus ** myproc "Casablanca", 

if @mystatus 1 = 0 

begin 

{do error processing} 

end 


"Humphrey Bogart" 


else 


begin 


end 


(do normal processing} 


Once a stored procedure has been created, a user can invoke 50 In addition to providing parameters when invoking a 
the stored procedure using the following syntax: stored procedure, default values for the parameters may be 

specified: 


exec [database.owner.] {procname} {opt params} 


55 


In the above syntax, "database" and "owner" will default to 
the current database and the current dbo (database owner). 
For example, the following command may invoke the stored 
procedure "myproc" defined above: 


exec myproc "Casablanca" "Humphrey Bogart" 


create proc myproc 

( @myparaml 
@myparam2 
@myparam3 
) 


int- 0 
int - null, 
char(20) - 


'myde fault" 


The above stored procedure provides a default value of 0 for 
@myparaml, a default value of null for @myparam2, and a 
default value of "mydefaulr" for @myparam3. To invoke the 
65 stored procedure, the user may specify the values for one or 
more parameters, if desired, or may allow one or more of the 
default values to be utilized: 
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exec myproc 1 /• paranu 2 & 3 default V 


In the above invocation, only @myparaml is specified in the 
execution line. Consequently, the specified default values for 
@myparam2 and @myparam3 are utilized (i.e., null and 
"mydefault"). 

Additionally, stored procedures can have output param- 
eters as demonstrated below: 


create proc myproc 

( @myparaml int 
@myoutparam int output 

) 

as 

begin 

select ©myoutparam = count (*) 
from titles 

where x > @myparaml 

end 


The output for the above stored procedure is the number of 
rows (i.e., the count) in the titles table where the value of x 
is greater than @myparaml. 

To further optimize the processing time for working with 
and manipulating the data, some DBMS have distributed the 
data and provided for parallel processing of and to the data. 
Thus, the stored procedures utilized to manipulate and work 
with the data are executed in parallel on the parallelized/ 
distributed data. Some stored procedures are associated 
directly with certain types of data on a particular data server 
(storage location for the data). However, these stored pro- 
cedures may attempt to manipulate and retrieve information 
from data not located on the data server where the stored 
procedure is located. Accordingly, it is difficult to start up 
parallel execution of a stored procedure that resides on any 
one data server. 

Further, since the stored procedures may be written inde- 
pendently from (and without knowledge of the parallelized 
data system, it is difficult to provide results to the stored 
procedure in a clean manner. In other words, when a stored 
procedure operates or requests data (i.e., using SQL 
commands), the interface within which the results are 
returned is difficult to establish and maintain without expos- 
ing the parallelism to the stored procedure. What is needed 
is a system and method for efficiently and cleanly executing 
SQL statements from stored procedures on a parallelized 
DBMS. 

SUMMARY OF THE INVENTION 

To address the requirements described above, the present 
invention discloses a method, apparatus, and an article of 
manufacture for parallel execution of SQL operations from 
stored procedures. 

The method comp rises providing the stored procedure 
with a C++ class (herei nafter referred 10 as "Uisp aicher'T ihat 
can take an s(}i] query arid Start parallel exbcuHon"f5f t he 
query. T he query Is optimized and paralleiizea. Th e dis- 
patcher executes the query, sets up the communication links 
between the various operators in Ihe" query, arid ensuresthat 
all the results are sent back to the data-server that originated 
jhe query reques t. Further, the dispatcher merges the results 
o f the parallel execution and produces a single stream o f 
t uples that is fed to the calling stored procedure, ' lb provide 
the single stream to the calling stored procedure, one or 
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more embodiments of the invention utilize a class that 
provides the stored procedure with a simple and easy-to-use 
interface to access the results of the nested SQL execution. 
In one or more embodiments of the invention, a C++ class 
5 such as the TOR InputStream class available from NCR 
Corporation, the assignee of the present invention is utilized. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Referring now to the drawings in which like reference 
10 numbers represent corresponding parts throughout: 

FIG. 1 is a block diagram showing an exemplary envi- 
ronment for practicing one or more embodiments of the 
present invention; 
15 FIG. 2 is a diagram illustrating the details of the query 
scheduler in accordance with one or more embodiments of 
the invention; 

FIG. 3 is a flow chart illustrating the operation of a 
parallelizer in accordance with one or more embodiments of 
20 the invention; 

FIG. 4 is a flow chart illustrating the operation of a 
dispatcher in accordance with one or more embodiments of 
the invention; 

FIG. 5 is a diagram showing one or more embodiments of 
25 the user front end of the exemplary hardware environment 
depicted in FIG. 1; 

FIGS. 6A and 6B are diagrams illustrating one or more 
embodiments of the invention; 

FIG. 7 is a block diagram showing an exemplary envi- 
ronment for practicing one or more embodiments of the 
present invention; 

FIG. 8 is a flow chart illustrating the operation of various 
nodes in accordance with one or more embodiments of the 
35 invention; and 

FIG. 9 is a flow chart illustrating the retrieval of query 
results in accordance with one or more embodiments of the 
invention, 

An DETAILED DESCRIPTION 

40 

In the following description, reference is made to the 
accompanying drawings which form a part hereof, and 
which is shown, by way of illustration, several embodiments 
of the present invention. It is understood that other embodi- 

45 ments may be utilized and structural changes may be made 
without departing from the scope of the present invention. 
Parallel Execution of SQL Outside of Stored Procedures 

In one, or more embodimen ts of the invention, a relational 
database or an object relational" database (e.g^ftradalatB 

so ubjeci Relational (TOR) database) may pe utilized, in an 
object relational database, tables of information may contain 
both coded (alphanumeric) data and multimedia object data. 
These tables are referred to as object-relational tables. 
Coded data is contained in table columns defined with 

55 traditional relational database data types, such as integer, 
character, floating point, and date. Objects are contained in 
table columns defined with Abstract Data Types (ADT) such 
as text, images, audio, and video. The objects persist in the 
table such that they may be retrieved and used instead of 

60 creating a new instance of an object. Further SQL commands 
may be utilized to interact with and manipulate the records/ 
information in the tables. Alternatively, data in an object 
relational database may be wrapped or encapsulated by an 
object that provides an interface to edit, delete, manipulate, 

65 etc. the data. 

FIG. 1 is a diagram showing an exemplary epvironrn ent 
in which one or more embodiments of 1 the invention provide 
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f or parallel execution of SQL commands. However, t he the execution rates of the different operators in the pipeline. 

underlying details of FIG. 1 flrp p ind^ efl jp provide fo r Network streams can be further specialized into split 

parallel execution of SQL when the SQL is from a store d streams, which are used to demultiplex an output stream into 

" procedure. T h e database system 100 uses a client-serv er multiple output streams based on a function being applied to 

architecture co mprising a query scheduler izl implemente d 5 each tuple. Split streams are one of the key mechanisms used 

in a que ry coordinator (QC) 104 and one or mor e data to parallelize queries. Since all types of streams are derived 

servers (US) 130 A-130E (hereinafter refenreH to'asAfat a from a base stream class, their interfaces are identical and 

servet(s) 130) storing data i n one or more data storag e the implementation of each operator can be totally isolated 

devices 132A-1J2K (herei nalter referred to as data stora ge from the type of stream it reads or writes. At runtime, the 

d evice(s) lii. lfre data servers 1^0 also perform portions of 10 scheduler thread (running in the QC process 104 through 

the execution plan in execution threads as determin e? by t he Query Scheduler 122), which is used to control the parallel 

query coordinator 11M to exeCuf e~Tfae^uel^The query execution of the query, instantiates the correct type of stream 

coor dinator 104 and data serv ers 13 0 ma>H5e"implemebt ed objects to connect the operators. Alternatively, when a stored 

irTseparate machines, or may be implemented as separate or procedure is executing, the stored procedure may instantiate 

"related processes in a single macFirieT pie ^OETlQ4 and the 15 the correct type of stream object in order to retrieve the 

j^T 130 co mmunicate via a communicauori^infrastructu re results from query execution (see description below). 

13Aw]uch ca n,autom aticaIly s elect the niost efficient mecha- F or the most part, the databa se system uses standar d 

nism for t he transpo rt of data between the UCi04 and any a lgorithm s to r each oi ^tfte basic relational operators. Indexed 

o n e ^ot the" l\V l^lt^lem entsjVhen a message is between selections are provided for bo th non-spa tial and_spatial 

processes or entities ferdo not share a common memory 20 selections. Jh'or loin operations, the query optimiz er 12<Pcan 

system, a transport protocol such as transmission control ^flfifi frnm qgftfcd loops, i ndexed n y te j_ jggP s > ^ 

protocol (TCP) or message passing interface (MPI) can be H riarP lV mprn orv hV hWH h *<rh 1 ni ' ns ^ ri«t»hjisr. system »s 

utilized to transfer the information. However, when the query optimizer 126 considers replicating small outer tables 

communication is between processors on a symmetric rmil- when an index exists on the join column of the inner table, 

^processing system (SMP), memory may be used as the 25 The database system uses a two-phase approach for jhe 

transport vehicle. parallel execution of aggregate operations. For examp le, 

Client processes 102, w hich can i nclude applications or consider a query revolving an average operator with a gro up 

graphi cal user interfaces( GUIs )^can connect to the QC 10 4 • by clause. During the first phase each part jcjpj rting thre ad 

tor submitting a query. After parsing and optimization, the processes itslragment of the input table producing a nmnin g 

QCT104 generates an execution plan (referred to as an ASCII 30 sum and count for each group. During the second phase a 

plan) for the query, performs further processing on the si ngle processor (typically) combines the results from~t he 

ASCII plan (discussed below), and transmits portions of that fi rst phase to produce an average value fo reach group ? 

plan to the appropriate data servers 130A-130E for execu- * Since standard Si(jL has a well defined set of aggregate 

tion. Hence, the QC 104 controls the parallel execution of operators, for each operator the functions that must be 

the query on the DS 130 processes. Query results including 35 performed during the first and second phases are known 

result sets are collected by the QC 104 for delivery back to when the system is being built and, hence, can be hard coded 

the client process 102. into the system. However, in the case of an object-relational 

The QC 104 and DS 130 processes can be implemented system that supports type extensibility, the set of aggregate 
as multithreaded processes on top of a storage manager 128. operators is not known in advance as each new type added 
The storage manager 128 provides storage volumes, files of 40 to the system may introduce new operators. Hence, a mecha- 
untyped objects, B+ trees and R* trees. Objects can be m>m is provided for specifying the first and second phase 
arbitrarily large, up to the size of the storage volume. In one function with the definition of each aggregate, 
embodiment, allocation of storage space within a storage The query coordinator 104 also comprises a tuple man- 
volume is performed in terms of fixed size extents. The ager 120, a catalog manager 118, a query optimizer 126, a 
associated I/O processes and the main storage manager 128 45 query scheduler 122, and a storage manager 128. The tuple 
server process share the storage manager 128 buffer pool, manager 120 receives the tuples from the data servers 130, 
which is kept in shared memory. formats and processes the tuples, and passes them along to 

The database system 100 uses many basic parallelism the client program 102. The catalog manager 118 manages 

mechanisms. Tables may be fully partitioned across all disks metadata regarding the tables and types in the database. The 

in the system 100 using round robin, hash, or spatial 50 query optimizer generates an execution plan (referred to as 

declustering. When a scan or selection query is executed, a an ASCII plan) for queries received from the client process 

separate thread is started for each fragment of each table. 102. 

In one embodiment, t he dfltnfrft y T system 100 a]$q )ises a Thus, in accordance with one or more embodiments of the 

push model of parallelism to implement partitioned exec u- invention, when a query comes into the system, the query is 

ti on in which tuples are pushed from leaves of the oper ator 55 sent to query optimizer 126 where the query is parsed/ 

tree upwar ds Eve ry database system 100 operator (e!gTj oin; typechecked, optimized, and generated into an ASCII plan (a 

sort, select,! . . ) takes its input from an input stream and sequential execution plan). The ASCII plan is then for- 

places its result tuples on an output stream, in e streams warded to query scheduler 122. As illustrated in FIG. 2 

themselves are L++ objects and can "be specialize d in th e query scheduler 122 contains a parallelizer 202 and a 

torm ol "hie streams" and u network streams' 1 . File streams 60 dispatcher 204. When the ASCII plan is forwarded from 

are used to read/writ e tuples from/to disk. Network streams query optimizer 126 to query scheduler 122, parallelizer 202 

are used to move data between operators either tnroug n receives the ASCII plan and generates a parallel execution 

shared-memo ry or across a communications networFvia a plan. The parallel execution plan is then sent to dispatcher 

transport protocol (e.g. I CP/IF or Mrlj. I n addition to 204. Dispatcher 204 performs the execution of the parallel 

providin g t ransparen t c ommunication b etween op erator ^ on 65 plan. 

t he same or differenLprocessors. n e twork s treams also FIG. 3 is a flow c hart illustrating the operation of paral- 

provide a flow-control mechanism that is usedlo"legulate lelizer 20^. At step 3Uli, the ASCII plan is read/received 
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from query optimizer 126. Once read, a new plan is devel- supports querying, browsing, and updating of database 

oped. At step 302, parallelizer 202 determines the placement objects through either its graphical or textual user interfaces, 

of the various operations on the various nodes (i.e., appro- In either case, the front end transforms a query into an 

priate data server 130) in the system. At step 304, split- extended SQL syntax and transmits it to the data server 130 

streams are inserted into the new plan at the appropriate 5 for execution. After executing the query, the query coordi- 

locations. Split-streams provide the ability to perform the nator 104 transmits the results back to the client program 

necessary declustering and partitioning of data servers 130 102 in the form of a set of tuples that can be iterated over 

and data storage devices 132. At step 306, any scans and using a cursor mechanism. In one embodiment, all commu- 

stores that Arc necessary are inserted into the plan. , nications between the front end 108 and the processes 

At step 308, the completed new plan is oota ined. in op e implemented in the query coordinator 104 are in the form of 

or mole~embodiments of the invention^ the ASCII planls remote procedure calls 114A and U4B implemented over a 

broken down and an "exec_plan_J^ is obtained. An exec__ Transmission Control Protocol/Internet Protocol (TCP/IP). 

plan_t is essentially a break-up of the ASCII plan into a list The client process 102 also comprises a tuple cache 106 for 

of segments . In accordance witn" steps 304 and 306, the retaining tuples received from the query coordinator 104. 

split -streams and scans and stores are inserted into the Abstract Data Types (ADTs) 11 6A and 116B can be stored 

exec_plan_t. The new plan can be executed by sequentially 15 and/or processed in either the query coordinator 104 or the 

executing each segment one after another. Each segment is client process 102. 

a list of operators that can be executed concurrently in a The client front end 108 permits the display of objects 

pipelined fashion. Thus, since each segment can be executed with spatial attributes on a 2-D map. For objects with 

c oncurrently by different aata servers 130rthe parallelism of multiple spatial attributes, one of the spatial attributes can be 

database system 100 is established. A t step 310, memory 20 used to specify the position of the object on the screen. The 

needed for the individual operators of each segment are spatial ADTs currently supported include points, closed 

allocated. In one or more embodiments of the invention, the polygons, polylines, and raster images. 

MemoryManager is utilized for memory allocation. The client front end 108 can also present a layered display 

However, it should be noted that any memory management of overlapping spatial attributes from different queries or 

technique/program may be utilized in accordance with 25 tables. For example, one can display city objects that satisfy 

embodiments of the invention. a certain predicate (e.g. population>300K) in one layer on 

FIG. 4 is a flow chart illustrating the operation of dis- top of a second layer of country objects, 

patcher 204. Dispatcher 204 is responsible for dispatching The client front end 108 also allows the user to query 

and executing an exec_plan_t by starting segments on through a graphical interface; implicitly issuing spatial que- 

appropriate data servers 130. At step 400, dispatcher 204 30 ries by zooming, clicking, or sketching a rubber-banded box 

packs operator information and arguments into valises. A on the 2-D map. The graphical capabilities of the client can 

valise is a flat representation of scheduler data structures that be implemented using toolkits such as Tk/XU. Further, the 

can be transmitted over a network from one node to another user can query by explicitly composing ad-hoc queries in the 

node. Nodes may be viewed as units, systems, or any other database system's 100 extended SQL syntax, 

structure in database system 100. For example, QC 104, 35 The user can use the client front end 108 to browse the 

client program 102, data servers 130, and data storage objects from a table. In this mode, attributes are displayed as 

devices 132 are all nodes. ASCII strings. The front end 108 can also be used to update 

At step 402, dispatcher 204 sets up various endpoints so database objects. Objects) to be updated can be selected 

that each operator in the pipeline knows where to send its either by pointing-and-clicking on the 2-D map or by 

results. At step 404, a determination is made regarding 40 selecting via the textual browser. 

whether any more segments are left that have not been Finally, the client front end 108 can also be used to 

executed. If not, the process is complete at step 412. If perform general catalog operations including browsing, cre- 

segments are left, a determination is made as to whether the ating new databases, defining new tables, creating indices on 

segment contains any operators that have not been executed. attributes, and bulk loading data into tables from external 

If there are no operators left in the current segment, pro- 45 files. 

cessing continues at step 404. The database system 100 also advantageously uses a 

However, if operators are left, the operator is started at second communication path 140 to transmit selected data 

step 408. At step 410, the results for the operator are such as master object data and large objects to the client 102, 

transmitted to the endpoint set-up in step 402 and dispatcher as described further below. The direct data transfer module 

204 waits for the status reports from the operator. Processing 50 142 in the client 102 receives this data, 

then continues at step 406. FIG. 5 is a diagram showing one embodiment of the user 

In one or more embodiments of the invention, the plan front end of the exemplary environment depicted in FIG. 1. 
(e.g., exec_plan_t) is set up in such a manner, that the The client front end 108 comprises a map view 502, layer 
top-most operator in the query writes its output to a query manager 504, browser 506 and a query composer 508. The 
tree. A query tree is a hierarchical structure that provides the 55 map view 502 is responsible for displaying and manipulat- 
ability to order the results/output from a query. After the ing objects contained in one or more layers. The current 
query outputs the results to the query tree, the query tree position of the cursor is continuously displayed in a sub- 
stores the results in a temporary file on a disk. window in units of the map projection system. Users can 

Thus, in accordance with one or more embodiments of the point and click on displayed objects to view their non-spatial 

invention, dispatcher 204 starts up all operators in a 60 attributes. The layer manager 504 is responsible for adding, 

segment, transmits the results, and waits for the status deleting, hiding, and reordering layers displayed by the map 

reports from each operator started. Additionally, after all of view 502. Each layer corresponds to a table of objects 

the operators in a segment have completed execution, dis- produced by executing some query. The extent browser 506 

patcher 204 repeats the process for all of the other segments allows a user to view any database table and adjust the way 

in the exec__plan t. 65 it should be displayed by the map view 502. The selected 

Referring back to FIG. 1, client program 102 comprises a table becomes a new layer with its spatial attributes display- 
front end 108, which provides a graphical user interface that able via the map view 502. 
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Hie query composer 508 allows a user to compose a SQL transport connection with the client 102 via the second 

query using a simple text editor. The RPC 114 is the communication path 140 using the client IP address and port 

interface to the query coordinator 104. It ships SQL queries number. The client 102 accepts 628 the transport connection, 

to the query coordinator 104 for execution and retrieves FIG. 6 B shows the client data server transmitting 632 the 

result tuples into the cache 510. The cache 510 comprises a 5 master data via the transport connection on tnc "second 

master data cache 510A, a metadata cache 510B and an communication path 140 to the client 102. In on e 

object cache 510C. The object cache 510C caches the result * embodiment, the master data^mclu de s information such as 

of a query in formats understood by the map view 502. The the dim ensions an d size of the related mas te r object, a nd an 

metadata cache 510B stores the catalog information of the ob ject io^ntmcatiorrTAblD) for the data objects associate d 

currently open database. The master data cache 510A stores 10 with the master data object. 

retrieved master data as described further below. In one TheJtser may then request one or more data objects in the 

embodiment, the object cache 510C also caches the objects array tha L ^cumulativelY represents the master data objecl . 

downloaded from the data servers 130. FIG. 5 also shows the Tfrjs, data is re presented in one of the data"o^jects, hayin g 

second communication path 140 from the data server 130 to related object identifications (AOIDs) that were transmit ted 

the user front end 108 via the direct data transfer module 15 wi thihejnaster data o&ject. T hat request can be submitted by 

142. transmitting a message including the AOIDs of the data 

Array-based abstract data types (ADTs) can be used as objects of interest and the address of the client 102 to the 

basis for a number of useful data types, includiDg BLOBs, query coordinator 104. T he query coordinator 104 again 

CLOBs, video, audio, text, image, maps and other large -geTrcraies riJo a simple execution plan designed to retrieve 

objects. Array-based ADT use an external out-of-line stor- 20 the data objects of interest from the data server 130. The 

age for very large objects. execution plan and the address of the client 102 are sent to 

FIG. 6A is a diagram illustrating one or more .embodi- the data server 130, which retrieves 638 the objects from the 

ments of the present invention. The client 102 transmits 602 storage manager. In one embodiment, the data server uses 

a message to the query coordinator 104. The message the client address to initiate 644 a transport connection with 

includes a database query, and may optionally include client 25 the client 102. The client 102 accepts 642 the transport 

address information. The query coordinator 104 receives connection. In another embodiment of the present invention, 

606 the message and using the information in the message, the transport connection originally established with the 

generates 608 an execution plan (i.e., query optimizer 126 client 102 (in block 626) is maintained for the duration of the 

generates the ASCII plan). The execution plan is then parsed session with the client 102. In this embodiment, there is no 

into one or more portions, each of which represents an 30 need to re-establish the transport connection between the 

execution plan thread (an exec_4>lan_t) (i.e., the query data server 130 and the client 102. In any case, the data 

scheduler 122 using parallelizer 202 and dispatcher 204 server 130 transmits 648 the requested data object, and the 

parallelize and provide for execution of the plan). These are client 102 receives it through the second communication 

transmitted to the data servers 130, which receive 610 and path 140. 

execute 612 their respective portions of the execution plan. 35 Using the foregoing technique, all of the AOIDs for the 

Results from the data servers 130 are transmitted to the master data are collected across all of the tuples in the 

query coordinator 104, which compiles the results into a current query result set, and are fetched all in one batch, 

query result having a result set. This information is trans- Similarly, the array data objects can be fetched in batch 

milted 614 back to the client 102. In one embodiment, this mode. The same approach can be applied to the real objects 

query result information includes an object identification 40 themselves. 

(OID) for the master data (MOID). Other query result Stored Procedures 

information may also be included, such as the dimensions One or more embodiments of the invention allow users to 

and size of the master data. specify and attach arbitrary stored procedures to the existing 

It is important to note that master data associated with the datatypes in data servers 130. As described above, stored 

MOID (that which is responsive to the database query) can 45 procedures are typically utilized or invoked within the 

be quite large. Hence, while the master data could be RDBMS (Relational Database Management System), 

delivered via the query coordinator 104 to the client 102, Stored procedures can be supplied from many sources, 

direct transfer from the data server in such cases best Database users can write stored procedures. Stored proce- 

accomplishes such delivery. dures that perform a commonly used set of commands can 

Returning to FIG. 6A, the client transmits 618 a request 50 also be supplied by the DBMS owner such as NCR 
for the master data. Included in this request is the OID for Corporation, the assignee of the present invention, 
the master data (the MOID). In one embodiment, this client Alternatively, stored procedures can be supplied by third- 
address information is globally unique and includes the party vendors. 

client address and port. Nested Execution of SQL from Stored Procedures 

The query coordinator 104 accepts this message, and from 55 Parallel execution of stored procedures relies on the fact 

the information in the message, generates 620 another that all stored procedures are executed on the various data 

execution plan. This execution plan (which includes the servers 130 in the parallel system 100. In other words, the 

client global address) is simpler than the one generated stored procedures are associated with and executed on 

previously (to respond to the database query), but particular data servers 130. As described above, it is desir- 

nonetheless, still requires the generation of a fetch object 60 able that the parallelization and execution of the SQL from 

operator. Alternatively, the MOID can be transmitted from a stored procedure remain transparent to the stored proce- 

the client 102 to the data server 130 without intervention of dure such that the user defining the stored procedure does 

the query coordinator 104, or the query coordinator 104 can not know, need to know, or maintain knowledge of the 

simply accept the request* for the master data and pass it underlying parallelization of the DBMS, 

along to the data server 130. 65 As described above, since stored procedures are executed 

The data server .130 receives 622 and executes 624 the on individual data servers 130, it is problematic to obtain 

applicable portion of the execution plan, and initiates 626 a and execute queries from within the stored procedure on one 


05/24/2004, EAST Version: 1.4.1 


US 6,507,834 Bl 

13 14 

data server 130 when the data that is the subject of the SQL flattened and packed in a valise. Thereafter, the exec_ 

of the stored procedure may reside on different data servers plan_J can either be stored in the catalogs or shipped to data 

130. Such problems are further exacerbated since the execu- servers 130. In one or more embodiments of the invention, 

lion of the query needs to remain transparent to the stored the plan can be shipped to a data server 130 as a part of a 

procedure. 5 predicate (of an SQL query). Further, prior to shipping the 

Further, using the embodiments described above, when plan to data server 130, the endpoints may be set up by QC 

executing SQL commands/queries, the system catalogs, as 104. 

described above, are needed. System catalogs are only As illustrated in FIG. 7, each data server 130A-130E can 

available to query coordinator 104 through catalog manager have an instance of a dispatcher object 204A-204E. Each 

118 and are not accessible from data servers 130. Thus, in 10 instance has the capability to receive a flattened execution 

the above embodiments, execution of an "inner" nested SQL plan (e.g., from QC 104) and execute the plan. In one or 

query from a stored procedure at a data server 130 can be a more embodiments of the invention, data server 130 and QC 

problem. 104 are running the same executable code (i.e., data server 

To execute nested SQL queries from stored procedures at 130 and QC 104 are each performing actions needed to 

a data server 130, one or more embodiments of the invention 15 execute the same code of the stored procedure), 

conduct preprocessing on the SQL query. In such Consequently, all of the dispatcher 204 code is available to 

embodiments, the preprocessing breaks up the execution of data server 130. 

the SQL query into two parts: (1) a part that requires access As described above, the stored procedure is executed on 

to system catalogs, and (2) a part that does not require access the various data servers 130A-130E. FIG. 8 is a flow chart 

to system catalogs. Subsequently, it is ensured that the first 20 illustrating the operation of various nodes in accordance 

part is executed at QC 104 before the query starts execution. with one or more embodiments of the invention. A step 800, 

The second part, which does not require access to the system execution of the stored procedure begins. At step 802, a 

catalogs, can then be safely executed at the local nodes (i.e., determination is made as to whether a nested SQL statement 

data servers 130). is encountered in the stored procedure. If not, execution of 

As described above, certain operations may only be 25 the stored procedure continues. If an SQL query is 

performed at certain nodes (e.g., catalog accesses may only encountered, the relevant portions of the query (e.g., the 

be performed by QC 104). To enable the paralellization of portions that cannot be executed on the data server 130) are 

the query execution, one or more embodiments of the forwarded to QC 104 at step 804. At step 806, QC 104 

invention parallelize and divide the operations needed to generates the ASCII plan (e.g., using query optimizer 126). 

execute the query and distribute the operations to either QC 30 At step 808, parallelizer 202 parallelizes the plan and 

104 or data server 130. generates an exec__plan_t. At step 810, the various end- 

Parallelizer 202 heavily relies on catalogs. Consequently, points are set up. At step 812, the exec_j>lan_t is sent to the 

the parallelizer 202 's functionality cannot be moved outside appropriate dispatcher 204A-204E for execution. Thus, 

of QC 104 (e.g., to data servers 130). However, dispatcher during execution of the stored procedure, the flattened plan 

204 does not need the catalogs. Thus, in accordance with one 35 (e.g., the exec_plan_t) can be sent to the dispatcher 204 for 

or more embodiments of the invention, dispatcher 204 is execution, 

moved to data server 130. FIG. 7 illustrates the data system Nested Query Result Retrieval 

100 as illustrated in FIG, 1 with dispatchers 204A-204E After execution of the query by the various dispatchers 

located on data servers 130A-130C. FIG. 7 also illustrates 204, the results of the query must be retrieved. Normally, the 

that parallelizer 202 still remains in query scheduler 122 of 40 top-most operator of a query outputs the results to a query 

QC 104. To move dispatcher 204 to data server 130, one or tree where the results are stored in temporary memory (e.g., 

more embodiments of the invention provide/permit a stored on a temporary file in a disk or in cache). In one or more 

procedure to utilize a dispatcher class for the execution of embodiments of the invention, the topmost operator sends 

queries. Thus, an instance of a dispatcher class (that may be the results back to the stored procedure instead of the query 

programmed in any programming language including object 45 tree. 

oriented programming languages such as C++) may be To enable transmission of the results to the stored 

instantiated at each data server 130. procedure, one or more embodiments of the invention utilize 

To separate the non-system catalog operations from the streams of information as described above. To receive a 

system-catalog operations (i.e., the move of dispatchers 204 stream of information, the stored procedure (or the QC 104 

to data servers 130), one or more embodiments of the 50 as described above) identifies/allocates an endpoint (Le., 

invention implement a two phase plan. In the first phase, a step 810 of FIG. 8) where the stored procedure can receive 

query is "prepared" for execution. Phase one is executed at input. The stored procedure then sets up an input stream that 

QC 104. The "preparation" of the query includes all actions obtains input from the specified endpoint. Once the endpoint 

performed by QC 104 as described above except for actions is designated, the stored procedure instructs the dispatcher to 

performed by dispatcher 204. Thus, phase one includes 55 dispatch the SQL query in such a way that the topmost 

query optimization and the generation of an ASCII plan by operator of the query sends all of its output to the specified 

query optimizer 126. Additionally, phase one includes the endpoint. 

parallehzationof the query by parallelizer 202 (i.e., all of the In one or more embodiments of the invention, the fol- 

steps of FIG. 3) resulting in the generation of the exec_ lowing stored procedure code illustrates the use of input 

plan__t as described above. 60 streams and endpoints to receive query results: 

After phase one is complete, no more catalog look-ups are 
needed for execution of the query. However, some steps 
described above that are performed by dispatcher 204 (e.g., 

steps 400 and 402) do not utilize the system catalogs. Tp^- m?JL 0; 

Accordingly, QC 104 may perform either or both steps 400 65 bascAdt • parameters - cuntnt_iiicral_valuca 0; 

and 402 of FIG. 4. Thus, in accordance with one or more dispatcher- >execut£ (plan, ep, parameters); 
embodiments of the invention, the exec__plan_J can be 
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-continued 

[nStrcam * stream ■ create stream (cp); 

while (tuple - stream. getNexfRiple 0) 
do_stuff (tuple); 


The example code above provides for the execution of 
various operations. The first line ("Endpoint *ep-create_ep 
( )f) provides for the creation of an endpoint called "ep". 
The line executes the create_ep( ) procedure and returns the 
result of the value. The result may be referenced by using the 
pointer to the result called ep. In accordance with embodi- 
ments of the invention, the endpoint creation and assignment 
may be performed at the QC node 104 or may be performed 
at the data server node 130. 

Hie second line ("exec_j>lan_t *plan=my_plan ( );") 
provides for obtaining an exec_j>lan__t (or a pointer to an 
exec__plan_t) called "plan" by calling a function called 
"my_plan " In accordance with embodiments of the 
invention, the exec_^plan_t is created by parallelizer 202 at 
the QC node 104. 

The third line ("baseAdt *parameters*>currentJiteraJ 

values ( ) ;") provides for obtaining the current values (or 
pointers to the current values) of abstract data types (ADT^ 
and placing (pointing to) them in a "parameters" variable. 

The fourth line ("dispatcher->execute (plan, ep, 
parameters) ;") provides for execution of the query by the 
dispatcher — the dispatcher 204 executes an operation from 
the specified plan using the specified parameters and outputs 
the result to the specified endpoint ep. In accordance with 
embodiments of the invention, dispatcher 204*8 functions 
are performed at the data server node 130. 

The fifth line ("Instream *stream-create_stream (ep);") 
creates an input stream (using the create_stream method 
and the pointer ep as a parameter) that will obtain/attach to 
the stream of information at the specified endpoint ep. The 
input stream may be referenced by using the pointer to the 
input stream called "stream". 

The sixth and seventh lines ("while (tuple- 
stream.getNextluple ( ))do_stuff (tuple);") provide for a 
loop that continues to execute while tuples are in the input 
stream. Thus, the lines enable a user to retrieve the next tuple 
(from the input stream) and perform an operation on the 
tuple. In this specific example, the do_stuff operation is 
performed on the retrieved tuple. Thus, the stored procedure 
can retrieve the results from the executed operation (the 
operation executed by the dispatcher in line 4) by examining 
the tuple on the input stream. Further, in accordance with 
one or more embodiments of the invention, the input stream 
class (i.e., "InStream") may be provided to the stored 
procedure so that the stored procedure may interface with 
the input stream relatively simply. 

FIG. 9 is a flow chart illustrating the retrieval of query 
results in accordance with one or more embodiments of the 
invention. At step 900, the various dispatcher instances 
204A-204E output their results to a single dispatcher 
instance 204. At step 902, the single dispatcher instance 204 
(at a particular data server 130) merges the results from the 
parallel execution of the various data servers 130 to produce 
a stream of tuples. At step 904 the merged stream of tuples 
are output to the specified endpoint. At step 906, the stored 
procedure obtains an input stream from the endpoint. At step 
908, each tuple may be obtained from the input stream by the 
stored procedure. Execution of the stored procedure contin- 
ues at step 910 (e.g., the stored procedure can perform 
operations on the tuple or continue executing other SQL or 
non-SQL commands). 
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By implementing the invention as described above, the 
stored procedure docs not need to know about the parallel- 
ization and execution of the query performed by QC 104 and 
data server 130. Instead, the stored procedure merely uses a 

5 dispatcher object to take and execute the query, and retrieves 
the results using a simple interface to an input stream. 

Thus, in accordance with one or more embodiments of the 
invention, the query is parallelized, the dispatcher executes 
the query, sets up the communication links between the 

10 various operators in the query, and ensures that all of the 
results arc sent back to the data server 130 that originated the 
query request At the data server 130 that originated the 
request, the dispatcher merges the results of the parallel 
execution and produces a single stream of tuples that is fed 

15 to the calling stored procedure using an input stream. 
Hardware and Software Embodiments 

Referring back to FIGS. 1 and 7, an exemplary hardware 
and software environment is shown that could be used to 
implement one or more embodiments of the invention. Each 

20 node (i.e., client program 102, QC 104, data server 130, and 
data storage device 132) may each comprise a computer that 
may have a processor and a memory, such as random access 
memory (RAM). Additionally, each node may be opera- 
tively coupled to a display, which presents images such as 

25 windows to the user on a graphical user interface (e.g., user 
front-end 108). Each node may be coupled to other devices, 
such as a keyboard, a mouse device, a printer, etc. Of course, 
those skilled in the art will recognize that any combination 
of the above components, or any number of different 

30 components, peripherals, and other devices, may be used 
with embodiments of the invention. 

Generally, each node may operate under the control of an 
operating system stored in memory. In one or more 
embodiments, the client program 102 (and the other nodes) 

35 interfaces with the user to accept inputs and commands and 
to present results through a graphical user interface (GUI) 
(e.g., user front-end 108). The instructions performing the 
GUI functions can be resident or distributed in the operating 
system, a computer program, or implemented with special 

40 purpose memory and processors. Embodiments of the inven- 
tion may also implement a compiler that allows an applica- 
tion program written in a programming language such as 
COBOL, C++, FORTRAN, or other language to be trans- 
lated into processor readable code. After completion, the 

45 application may access and manipulate data stored in 
memory using the relationships and logic that was generated 
using the compiler. Embodiments of the invention also 
optionally comprise an external communication device such 
as a modem, satellite link, Ethernet card, or other device for 

so communicating with other computers. 

In one embodiment, instructions implementing the oper- 
ating system, the computer program, the compiler, and 
database records, tables and information are tangibly 
embodied in a computer-readable medium, e.g., data storage 

55 devices 132, which could include one or more fixed or 
removable data storage devices, such as a zip drive, floppy 
disc drive, hard drive, CD-ROM drive, tape drive, etc. 
Further, such information may be partitioned across multip le 
InftTstorage devices 13i aS d escribed above. The operating 

60 system and the computer program may be comprised of 
instructions which, when read and executed by a computer, 
cause the computer to p ertorm the steps nece ssary to imple- 
ment and/or use tfte present invention. Computer programs 
and/or operating instructio ns may also be tangib ly embo died 

65 in memory and/or data communica tions devices, thereb y 
making a com puter program p^odu£ jjk^ticLe_pXjBanufac- 

* hire, acmrding to th5 invention. As such, the terms "article 
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of manufacture," "program storage device" and "computer 
program product" as used herein are intended to encompass 
a computer program accessible from any computer readable 
device or media. 

Those skilled in the art will recognize many modifications 
may be made to this configuration without departing from 
the scope of the present invention. For example, those 
skilled in the art will recognize that any combination of the 
above components, or any number of different components, 
peripherals, and other devices, may be used with the present 
invention. 

CONCLUSION 

This concludes the description of one or more embodi- 
ments of the invention. In summary, the invention describes 
a method, apparatus, and article of manufacture for parallel 
execution of SQL from within stored procedures. 

The method comprises providing a stored procedure with 
a class called dispatcher that can take a SQL query and start 
parallel execution of the query. The query is parallelized. 
The dispatcher executes the query, sets up the communica- 
tion links between the various operators in the query, and 
ensures that all of the results are sent back to the data server 
that originated the query request. At the data server that 
originated the request, the dispatcher merges the results of 
the parallel execution, and produces a single stream of tuples 
that is fed to the calling stored procedure using an input 
stream class. The input stream class provides the stored 
procedure with a simple and easy-to-use interface to access 
the results of the nested SQL execution. 

Other embodiments of the invention include a program 
storage device tangibly embodying instructions for perform- 
ing the method steps above on a computer or similar device, 
and an apparatus for performing the foregoing operations. 

The foregoing description of the preferred embodiment of 
the invention has been presented for the purposes of illus- 
tration and description. It is not intended to be exhaustive or 
to limit the invention to the precise form disclosed. Many 
modifications and variations are possible in light of the 
above teaching. It is intended that the scope of the invention 
be limited not by this detailed description, but rather by the 
claims appended hereto. The above specification, examples 
and data provide a complete description of the manufacture 
and use of the composition of the invention. Since many 
embodiments of the invention can be made without depart- 
ing from the spirit and scope of the invention, the invention 
resides in the claims hereinafter appended. 

What is claimed is: 

1. A method of parallel execution of SQL in a stored 
procedure comprising: 
executing a stored procedure on a first data server; 
determining when a nested SQL command is encountered 

in the stored procedure; 
transmitting at least a portion of the nested SQL command 

from the first data server to a query coordinator, 
the query coordinator generating an execution plan that 

provides for parallel execution of the portion of the 

nested SQL command; 
transmitting at least a portion of the execution plan from 

the query coordinator to a plurality of dispatcher 

instances on a plurality of data servers; 
the plurality of dispatchers executing the transmitted 

portions of the execution plan on the plurality of data 

servers; 

the plurality of dispatchers transmitting results from the 
executed portions of the execution plan to an input 
stream at an endpoint on the first data server; and 
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the stored procedure obtaining the results at the endpoint 
from the input stream. 

2. The method of claim 1 wherein the transmitting at least 
a portion of the nested SQL command, the transmitting at 
least a portion of the execution plan, and the query coordi- 
nator are transparent to the stored procedure. 

3. The method of claim 1 further comprising: 

one or more of said plurality of dispatchers merging 

results of the execution; 
specifying the endpoint to retrieve the merged results; and 
providing the stored procedure with a mechanism for 

retrieving the results in the input stream from the 

endpoint. 

4. The method of claim 3 wherein the input stream 
comprises one or more tuples. 

5. The method of claim 1 wherein said generating further 
comprises: 

a query optimizer generating an ASCII plan; and 
a parallelizer generating a parallel execution plan. 

6. A system for parallel execution of SQL in a stored 
procedure comprising: 

(a) a stored procedure comprised of one or more nested 
SQL commands; 

(b) a first data server configured to: 

(i) execute the stored procedure; 

(ii) determine when a nested SQL command is encoun- 
tered in the stored procedure; 

(iii) transmit at least a portion of the nested SQL 
command from the first data server to a query 
coordinator; 

(c) a plurality of dispatchers on a plurality of data servers 
configured to: 

(i) receive portions of an execution plan; 

(ii) execute the received portions of the execution plan 
on the plurality of the data servers; 

(iii) transmit results from the executed portions of the 
execution plan to an input stream at an endpoint on 
the first data server, wherein the stored procedure is 
further configured to obtain the results at the end- 
point from the input stream; 

(d) the query coordinator configured to: 

(i) receive the at least a portion of the nested SQL 
command; 

(ii) generate the execution plan that provides for par- 
allel execution of the portion of the nested SQL 
command; and 

(iii) transmitting at least a portion of the execution plan 
. to a plurally of the dispatchers on a plurality of the 

data servers. 

7. The system of claim 6 wherein the query coordinator 
and the operations of the dispatcher ate transparent to the 
stored procedure. 

8. The system of claim 6 wherein the dispatcher is further 
configured to merge one or more results of execution of said 
execution plan. 

9. The system of claim $ wherein the input stream 
comprises one or more tuples. 

10. The system of claim 6 further comprising: 

a query optimizer in the query coordinator, the query 
optimizer configured to generate an ASCII plan from 
the portion of nested SQL command; and 

parallelizer in the query coordinator, the parallelizer con- 
figured to generate a parallel execution plan from said 
ASCII plan. 

11. An article of manufacture for parallel execution of 
SQL in a stored procedure comprising: 


05/24/2004, EAST Version: 1.4.1 


US 6,507,834 Bl 


19 


20 


means for executing a stored procedure on a first data 
server, 

means for determining when a nested SQL command is 
encountered in the stored procedure; 

means for transmitting at least a portion of the nested SQL 
command from the first data server to a query coordi- 
nator; 

means for the query coordinator to generate an execution 
plan that provides for parallel execution of the portion 
of the nested SQL command; 

means for transmitting at least a portion of the execution 
plan from the query coordinator to a plurality of 
dispatcher instances on a plurality of data servers; 

means for the plurality of dispatchers to execute the 
transmitted portions of the execution plan on the plu- 
rality of data servers; 

means for the plurality of dispatchers transmitting results 
from the executed portions of the execution plan to an 
input stream at an endpoint on the first data- server; and 

means for the stored procedure to obtain the results at the 
endpoint from the input stream. 

12. The article of manufacture of claim 11 wherein the 
means for transmitting the portion of the nested SQL 
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command, the means for transmitting at least a portion of the 
execution plan, and the query coordinator are transparent to 
the stored procedure. 

13. The article of manufacture of claim 11 further com- 
prising: 

means for one or mote of said plurality of dispatchers to 

merge results of the execution; 
means for specifying the endpoint to retrieve the merged 

results; and 

means for providing the stored procedure with a mecha- 
nism for retrieving the results in the input stream from 
the endpoint. 

14. Hie article of manufacture of claim 13 wherein the 
input stream comprises one or more tuples. 

15. The article of manufacture of claim 11 wherein said 
means for generating further comprises: 

means for a query optimizer to generate an ASCII plan; 
and 

means for a parallelizer to generate a parallel execution 
plan. 
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It is certified thai error appears in the above-identified patent and that said Letters Patent is 
hereby corrected as shown below: 


Column 18, 

Line 47, delete "transmitting", and substitute - transmit 
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